---
title: "Comparison: TensorZero vs. LiteLLM"
sidebarTitle: "LiteLLM"
description: "TensorZero is an open-source alternative to LiteLLM featuring an LLM gateway, observability, optimization, evaluations, and experimentation."
---

TensorZero and LiteLLM both offer a unified inference API for LLMs, but they have different features beyond that.
TensorZero offers a broader set of features (including observability, optimization, evaluations, and experimentation), whereas LiteLLM offers more traditional gateway features (e.g. budgeting, queuing) and third-party integrations.
That said, **you can get the best of both worlds by using LiteLLM as a model provider inside TensorZero**!

## Similarities

- **Unified Inference API.**
  Both TensorZero and LiteLLM offer a unified inference API that allows you to access LLMs from most major model providers with a single integration, with support for structured outputs, batch inference, tool use, streaming, and more.<br />
  [→ TensorZero Gateway Quickstart](/quickstart/)

- **Automatic Fallbacks for Higher Reliability.**
  Both TensorZero and LiteLLM offer automatic fallbacks to increase reliability.<br />
  [→ Retries & Fallbacks with TensorZero](/gateway/guides/retries-fallbacks/)

- **Open Source & Self-Hosted.**
  Both TensorZero and LiteLLM are open source and can be self-hosted.
  When you self-host, your data never leaves your infrastructure, and you don't risk downtime by relying on external APIs.
  TensorZero is fully open source, whereas LiteLLM gates some of its features behind an enterprise license.

- **Inference Caching.**
  Both TensorZero and LiteLLM allow you to cache requests to improve latency and reduce costs.<br />
  [→ Inference Caching with TensorZero](/gateway/guides/inference-caching/)

- **Multimodal Inference.**
  Both TensorZero and LiteLLM support multimodal inference.<br />
  [→ Multimodal Inference with TensorZero](/gateway/guides/multimodal-inference/)

## Key Differences

### TensorZero

- **High Performance.**
  The TensorZero Gateway was built from the ground up in Rust 🦀 with performance in mind (&lt;1ms P99 latency at 10,000 QPS).
  LiteLLM is built in Python, resulting in 25-100x+ latency overhead and much lower throughput.<br />
  [→ Performance Benchmarks: TensorZero vs. LiteLLM](/gateway/benchmarks/)

- **Built-in Observability.**
  TensorZero offers its own observability features, collecting inference and feedback data in your own database.
  LiteLLM only offers integrations with third-party observability tools like Langfuse.

- **Built-in Evaluations.**
  TensorZero offers built-in evaluation functionality, including heuristics and LLM judges.
  LiteLLM doesn't offer any evaluation functionality.<br />
  [→ TensorZero Evaluations Overview](/evaluations/)

- **Automated Experimentation (A/B Testing).**
  TensorZero offers built-in experimentation features, allowing you to run experiments on your prompts, models, and inference strategies.
  LiteLLM doesn't offer any experimentation features.<br />
  [→ Run adaptive A/B tests with TensorZero](/experimentation/run-adaptive-ab-tests/)

- **Built-in Inference-Time Optimizations.**
  TensorZero offers built-in inference-time optimizations (e.g. dynamic in-context learning), allowing you to optimize your inference performance.
  LiteLLM doesn't offer any inference-time optimizations.<br />
  [→ Inference-Time Optimizations with TensorZero](/gateway/guides/inference-time-optimizations/)

- **Optimization Recipes.**
  TensorZero offers optimization recipes (e.g. supervised fine-tuning, RLHF, MIPRO) that leverage your own data to improve your LLM's performance.
  LiteLLM doesn't offer any features like this.<br />
  [→ Optimization Recipes with TensorZero](/recipes/)

- **Schemas, Templates, GitOps.**
  TensorZero enables a schema-first approach to building LLM applications, allowing you to separate your application logic from LLM implementation details.
  This approach allows you to more easily manage complex LLM applications, benefit from GitOps for prompt and configuration management, counterfactually improve data for optimization, and more.
  LiteLLM only offers the standard unstructured chat completion interface.<br />
  [→ Prompt Templates & Schemas with TensorZero](/gateway/create-a-prompt-template)

- **Access Control.**
  Both TensorZero and LiteLLM support virtual (custom) API keys to authenticate requests.
  LiteLLM offers advanced authentication features in its enterprise plan, whereas TensorZero requires complementary open-source tools like Nginx or OAuth2 Proxy for such use cases.<br />
  [→ Set up auth for TensorZero](/operations/set-up-auth-for-tensorzero)

### LiteLLM

- **Dynamic Provider Routing.**
  LiteLLM allows you to dynamically route requests to different model providers based on latency, cost, and rate limits.
  TensorZero only offers static routing capabilities, i.e. a pre-defined sequence of model providers to attempt.<br />
  [→ Retries & Fallbacks with TensorZero](/gateway/guides/retries-fallbacks/)

- **Request Prioritization.**
  LiteLLM allows you to prioritize some requests over others, which can be useful for high-priority tasks when you're constrained by rate limits.
  TensorZero doesn't offer request prioritization, and instead requires you to manage the request queue externally (e.g. using Redis).

- **Built-in Guardrails Integration.**
  LiteLLM offers built-in support for integrations with guardrails tools like AWS Bedrock.
  For now, TensorZero doesn't offer built-in guardrails, and instead requires you to manage integrations yourself.

- **Managed Service.**
  LiteLLM offers a paid managed (hosted) service in addition to the open-source version.
  TensorZero is fully open source and self-hosted, with no managed offering.

<Tip title="Feedback">

Is TensorZero missing any features that are really important to you? Let us know on [GitHub Discussions](https://github.com/tensorzero/tensorzero/discussions), [Slack](https://www.tensorzero.com/slack), or [Discord](https://www.tensorzero.com/discord).

</Tip>

## Combining TensorZero and LiteLLM

You can get the best of both worlds by using LiteLLM as a model provider inside TensorZero.

LiteLLM offers an OpenAI-compatible API, so you can call LiteLLM from TensorZero by configuring it as an OpenAI-compatible model provider.
Learn more about using [OpenAI-compatible endpoints](/integrations/model-providers/openai-compatible/).
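
As a rough illustration, the configuration below sketches what this could look like in `tensorzero.toml`. It assumes a LiteLLM proxy reachable at `http://localhost:4000/v1` that exposes a model named `gpt-4o-mini` and accepts a key stored in the `LITELLM_API_KEY` environment variable. The model name `litellm_gpt_4o_mini`, the provider name `litellm`, the URL, and the environment variable are all placeholders for illustration, so adjust them to match your deployment.

```toml
# Hypothetical TensorZero model backed by a LiteLLM proxy.
[models.litellm_gpt_4o_mini]
routing = ["litellm"]

[models.litellm_gpt_4o_mini.providers.litellm]
# LiteLLM speaks the OpenAI API, so use the OpenAI-compatible provider type.
type = "openai"
# Assumed address of your LiteLLM proxy.
api_base = "http://localhost:4000/v1"
# Model name as registered in your LiteLLM configuration (assumed).
model_name = "gpt-4o-mini"
# Read the LiteLLM key from an environment variable (assumed name).
api_key_location = "env::LITELLM_API_KEY"
```

With a model defined this way, requests to it flow through the TensorZero Gateway and then through LiteLLM, so you keep TensorZero's observability and experimentation while relying on LiteLLM for features such as budgeting or dynamic routing.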
